AMIDASE

This invention relates to newly identified
polynucleotides, polypeptides encoded by such
polynucleotides, the use of such polynucleotides and
polypeptides, as well as the production and isolation of
such polynucleotides and polypeptides. More
particularly, the polypeptide of the present invention
has been identified as an amidase and in particular an
enzyme having activity in the removal of arginine,
phenylalanine or methionine from the N-terminal end of
peptides in peptide or peptidomimetic synthesis.



Thermophilic bacteria have received considerable
attention as sources of highly active and thermostable
enzymes (Bronnenmeier, K. and Staudenbauer, W.L., in D.R.
Woods (Ed.), The Clostridia and Biotechnology,
Butterworth Publishers, Stoneham, MA (1993)). Recently,
the most extremely thermophilic organotrophic eubacteria
presently known have been isolated and characterized.
These bacteria, which belong to the genus Thermotoga, are
fermentative microorganisms metabolizing a variety of
carbohydrates (Huber, R. and Stetter, K.O., in Balows,
et al., (Ed.), The Procaryotes, 2nd Ed., Springer-Verlag,
New York, pgs. 3809-3819 (1992)).

Because to date most organisms identified from the
archaeal domain are thermophiles or hyperthermophiles,
archaeal bacteria are also considered a fertile source of
thermophilic enzymes.

SUMMARY OF THE INVENTION



In accordance with one aspect of the present
invention, there is provided a novel enzyme, as well as
active fragments, analogs and derivatives thereof.

In accordance with another aspect of the present
invention, there are provided isolated nucleic acid
molecules encoding an enzyme of the present invention
including mRNAs, DNAs, cDNAs, genomic DNAs as well as
active analogs and fragments of such enzymes.

In accordance with yet a further aspect of the
present invention, there is provided a process for
producing such polypeptide by recombinant techniques
comprising culturing recombinant prokaryotic and/or
eukaryotic host cells, containing a nucleic acid sequence
encoding an enzyme of the present invention, under
conditions promoting expression of said enzyme and
subsequent recovery of said enzyme.

In accordance with yet a further aspect of the
present invention, there is provided a process for
utilizing such enzyme, or polynucleotide encoding such
enzyme. The enzyme is useful for the removal of
arginine, phenylalanine, or methionine amino acids from
the N-terminal end of peptides in peptide or
peptidomimetic synthesis. The enzyme is selective for
the L, or "natural", enantiomer of the amino acid
derivatives and is therefore useful for the production of
optically active compounds. These reactions can be
performed in the presence of the chemically more reactive
ester functionality, a step which is very difficult to
achieve with nonenzymatic methods. The enzyme is also
able to tolerate high temperatures (at least 70°C) and
high concentrations of organic solvents (>40% DMSO), both
of which cause a disruption of secondary structure in
peptides; this enables cleavage of otherwise resistant
bonds.

In accordance with yet a further aspect of the
present invention, there is also provided nucleic acid
probes comprising nucleic acid molecules of sufficient
length to specifically hybridize to a nucleic acid
sequence of the present invention.

In accordance with yet a further aspect of the
present invention, there is provided a process for
utilizing such enzymes, or polynucleotides encoding such
enzymes, for in vitro purposes related to scientific
research, for example, to generate probes for identifying
similar sequences which might encode similar enzymes from
other organisms.

These and other aspects of the present invention
should be apparent to those skilled in the art from the
teachings herein.

BRIEF DESCRIPTION OF THE DRAWINGS

The following drawings are illustrative of
embodiments of the invention and are not meant to limit
the scope of the invention as encompassed by the claims.

Figure 1 is an illustration of the full-length DNA
and corresponding deduced amino acid sequence of the
enzyme of the present invention. Sequencing was
performed using a 378 automated DNA sequencer (Applied
Biosystems, Inc.).

Figure 2 shows the fluorescence versus
concentration of DMSO. The filled and open boxes
represent individual assays from Example 3.

Figure 3 shows the relative initial linear rates
(increase in fluorescence per min., i.e., "activity")
versus concentration of DMF for the more reactive
CBZ-L-arg-AMC, from Example 3.

DETAILED DESCRIPTION OF THE INVENTION

The term "gene" means the segment of DNA involved
in producing a polypeptide chain; it includes regions
preceding and following the coding region (leader and
trailer) as well as intervening sequences (introns)
between individual coding segments (exons).

A coding sequence is "operably linked to" another
coding sequence when RNA polymerase will transcribe the
two coding sequences into a single mRNA, which is then
translated into a single polypeptide having amino acids
derived from both coding sequences. The coding sequences
need not be contiguous to one another so long as the
expressed sequences are ultimately processed to produce
the desired protein.

"Recombinant" enzymes refer to enzymes produced by
recombinant DNA techniques; i.e., produced from cells
transformed by an exogenous DNA construct encoding the
desired enzyme. "Synthetic" enzymes are those prepared
by chemical synthesis.

The present invention provides substantially pure
amidase enzymes. The term "substantially pure" is used
herein to describe a molecule, such as a polypeptide
(e.g., an amidase polypeptide, or a fragment thereof)
that is substantially free of other proteins, lipids,
carbohydrates, nucleic acids, and other biological
materials with which it is naturally associated. For
example, a substantially pure molecule, such as a
polypeptide, can be at least 60%, by dry weight, the
molecule of interest. The purity of the polypeptides can
be determined using standard methods including, e.g.,
polyacrylamide gel electrophoresis (e.g., SDS-PAGE),
column chromatography (e.g., high performance liquid
chromatography (HPLC)), and amino-terminal amino acid
sequence analysis.

A DNA "coding sequence of" or a "nucleotide
sequence encoding" a particular enzyme is a DNA sequence
which is transcribed and translated into an enzyme when
placed under the control of appropriate regulatory
sequences. A "promoter sequence" is a DNA regulatory
region capable of binding RNA polymerase in a cell and
initiating transcription of a downstream (3' direction)
coding sequence. The promoter is part of the DNA
sequence. This sequence region has a start codon at its
3' terminus. The promoter sequence includes the
minimum number of bases or elements necessary to
initiate transcription at levels detectable above
background. After the RNA polymerase binds the
sequence and transcription is initiated at the start
codon (3' terminus with a promoter), transcription
proceeds downstream in the 3' direction. Within the
promoter sequence will be found a transcription
initiation site (conveniently defined by mapping with
nuclease S1) as well as protein binding domains
(consensus sequences) responsible for the binding of RNA
polymerase.

The present invention provides a purified
thermostable enzyme that catalyzes the removal of
arginine, phenylalanine, or methionine amino acids from
the N-terminal end of peptides in peptide or
peptidomimetic synthesis. The purified enzyme is an
amidase derived from an organism referred to herein as
"Thermococcus GU5L5", which is a thermophilic archaeal
organism with a very high temperature optimum. The
organism is strictly anaerobic and grows between 55 and
90°C (optimally at 85°C). GU5L5 was discovered in a
shallow marine hydrothermal area in Vulcano, Italy. The
organism has coccoid cells occurring in singlets or
pairs. GU5L5 grows optimally at 85°C and pH 6.0 in a
marine medium with peptone as a substrate and nitrogen in
the gas phase.

The polynucleotide of this invention was
originally recovered from a genomic gene library derived
from Thermococcus GU5L5 as described below. It contains
an open reading frame encoding a protein of 622 amino
acid residues.

In a preferred embodiment, the amidase enzyme of
the present invention has a molecular weight of about
68.5 kilodaltons as inferred from the nucleotide sequence
of the gene.
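
As a rough consistency check on the two figures above (622 residues,
about 68.5 kilodaltons), a protein's mass can be approximated from its
length alone. The sketch below assumes an average residue mass of about
110 daltons, a common rule of thumb that is not taken from this
specification, and the function name is hypothetical. The stated value
was inferred from the actual nucleotide sequence, so a length-only
estimate is only expected to land nearby.

```python
# Assumption: ~110 Da is a rule-of-thumb average mass per amino acid
# residue; it is not a value given in the specification.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues, avg_residue_mass_da=AVERAGE_RESIDUE_MASS_DA):
    """Approximate a protein's mass in kilodaltons from its residue count."""
    if n_residues <= 0:
        raise ValueError("residue count must be positive")
    return n_residues * avg_residue_mass_da / 1000.0


if __name__ == "__main__":
    # 622 residues is the open reading frame length stated in the text.
    print(f"{estimate_mass_kda(622):.1f} kDa")  # prints "68.4 kDa"
```

The estimate differs from the stated 68.5 kilodaltons by well under one
percent, which is consistent with the stated reading frame length.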

In accordance with an aspect of the present
invention, there are provided isolated nucleic acid
molecules (polynucleotides) which encode for the mature
enzyme having the deduced amino acid sequence of Figure 1
(SEQ ID NO:2).

This invention, in addition to the isolated
nucleic acid molecule encoding an amidase enzyme
disclosed in Figure 1 (SEQ ID NO:1), also provides
substantially similar sequences. Isolated nucleic acid
sequences are substantially similar if: (i) they are
capable of hybridizing under stringent conditions,
hereinafter described, to SEQ ID NO:1; or (ii) they
encode DNA sequences which are degenerate to SEQ ID NO:1.
Degenerate DNA sequences encode the amino acid sequence
of SEQ ID NO:2, but have variations in the nucleotide
coding sequences. As used herein, "substantially
similar" refers to the sequences having similar identity
to the sequences of the instant invention. The
nucleotide sequences that are substantially similar can
be identified by hybridization or by sequence comparison.
Enzyme sequences that are substantially similar can be
identified by one or more of the following: proteolytic
digestion, gel electrophoresis and/or microsequencing.

One means for isolating a nucleic acid molecule
encoding an amidase enzyme is to probe a gene library
with a natural or artificially designed probe using art
recognized procedures (see, for example: Current
Protocols in Molecular Biology, Ausubel F.M. et al.
(Eds.) Green Publishing Company Assoc. and John Wiley
Interscience, New York, 1989, 1992). It is appreciated
by one skilled in the art that SEQ ID NO:1, or fragments
thereof (comprising at least 15 contiguous nucleotides),
is a particularly useful probe. Other particularly useful
probes for this purpose are fragments hybridizable to the
sequences of SEQ ID NO:1 (i.e., comprising at least 15
contiguous nucleotides).

With respect to nucleic acid sequences which
hybridize to specific nucleic acid sequences disclosed
herein, hybridization may be carried out under conditions
of reduced stringency, medium stringency or even
stringent conditions. As an example of oligonucleotide
hybridization, a polymer membrane containing immobilized
denatured nucleic acid is first prehybridized for 30
minutes at 45°C in a solution consisting of 0.9 M NaCl,
50 mM NaH2PO4, pH 7.0, 5.0 mM Na2EDTA, 0.5% SDS, 10X
Denhardt's, and 0.5 mg/mL polyriboadenylic acid.
Approximately 2 X 10^7 cpm (specific activity 4-9 X 10^8
cpm/ug) of 32P end-labeled oligonucleotide probe are then
added to the solution. After 12-15 hours of incubation,
the membrane is washed for 30 minutes at room temperature
in 1X SET (150 mM NaCl, 20 mM Tris hydrochloride, pH 7.8,
1 mM Na2EDTA) containing 0.5% SDS, followed by a 30 minute
wash in fresh 1X SET at Tm-10°C for the oligonucleotide
probe. The membrane is then exposed to autoradiographic
film for detection of hybridization signals.
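
The final wash above is specified relative to the melting temperature
(Tm) of the oligonucleotide probe, but the text does not say how Tm is
obtained. One hedged way to sketch it is the Wallace rule for short
oligonucleotides, Tm = 2°C x (A+T) + 4°C x (G+C). That choice of
estimator is an assumption, not the method of this specification, and
the function names and example probe below are hypothetical.

```python
def wallace_tm_celsius(oligo):
    """Estimate Tm (deg C) of a short DNA oligo with the Wallace rule.

    Assumption: the Wallace rule, 2*(A+T) + 4*(G+C), is used as the
    estimator; the specification does not name a Tm formula.
    """
    seq = oligo.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("oligo must be a non-empty string of A, C, G, T")
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count


def stringent_wash_temperature(oligo, offset_celsius=10):
    """Temperature for the second wash, taken as Tm minus 10 deg C."""
    return wallace_tm_celsius(oligo) - offset_celsius


if __name__ == "__main__":
    probe = "ATGCGTACCGTTAGCAAG"  # hypothetical 18-mer, not from SEQ ID NO:1
    print(wallace_tm_celsius(probe))          # prints 54
    print(stringent_wash_temperature(probe))  # prints 44
```

For the hypothetical 18-mer, with nine A/T and nine G/C bases, the
estimate is 54°C, so the second wash would be run at about 44°C.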

Stringent conditions means hybridization will
occur only if there is at least 90% identity, preferably
at least 95% identity and most preferably at least 97%
identity between the sequences. See J. Sambrook et al.,
Molecular Cloning, A Laboratory Manual (2d Ed. 1989)
(Cold Spring Harbor Laboratory), which is hereby
incorporated by reference in its entirety.

"Identity" as the term is used herein, refers to a
polynucleotide sequence which comprises a percentage of
the same bases as a reference polynucleotide (SEQ ID
NO:1). For example, a polynucleotide which is at least
90% identical to a reference polynucleotide has
polynucleotide bases which are identical in 90% of the
bases which make up the reference polynucleotide and may
have different bases in 10% of the bases which comprise
that polynucleotide sequence.
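
The definition of identity above can be sketched as a position-by-
position comparison. The sketch assumes the two sequences are already
aligned and of equal length, with no gaps; how sequences are aligned is
not addressed here, and the function name and example sequences are
hypothetical.

```python
def percent_identity(reference, candidate):
    """Percentage of positions at which candidate matches reference.

    Assumption: both sequences are pre-aligned, gap-free and equal in
    length, so position i in one corresponds to position i in the other.
    """
    if len(reference) != len(candidate):
        raise ValueError("sequences must be aligned to the same length")
    if not reference:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for ref_base, cand_base in zip(reference.upper(), candidate.upper())
        if ref_base == cand_base
    )
    return 100.0 * matches / len(reference)


if __name__ == "__main__":
    reference = "ATGCATGCAT"  # hypothetical 10-base reference
    candidate = "ATGCATGCAA"  # differs only at the last base
    print(percent_identity(reference, candidate))  # prints 90.0
```

In this example nine of ten bases match, so the candidate is 90%
identical to the reference and differs in the remaining 10%.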



WO 97/48794



PCT/US97/09319 




The present invention also relates to
polynucleotides which differ from the reference
polynucleotide such that the changes are silent changes,
for example the changes do not alter the amino acid
sequence encoded by the polynucleotide. The present
invention also relates to nucleotide changes which result
in amino acid substitutions, additions, deletions,
fusions and truncations in the enzyme encoded by the
reference polynucleotide (SEQ ID NO:1). In a preferred
aspect of the invention these enzymes retain the same
biological action as the enzyme encoded by the reference
polynucleotide.
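
Whether a nucleotide change is silent can be checked by translating
both sequences and comparing the encoded amino acids. The sketch below
assumes the standard genetic code and an in-frame coding sequence; the
function names and the short example sequences are hypothetical and are
not taken from SEQ ID NO:1.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(coding_dna):
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    seq = coding_dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def is_silent_change(reference_dna, variant_dna):
    """True if the variant differs in bases but encodes the same protein."""
    return (
        reference_dna.upper() != variant_dna.upper()
        and translate(reference_dna) == translate(variant_dna)
    )


if __name__ == "__main__":
    reference = "ATGCGTTTT"  # hypothetical: Met-Arg-Phe
    print(is_silent_change(reference, "ATGCGCTTC"))  # True: still Met-Arg-Phe
    print(is_silent_change(reference, "ATGCATTTT"))  # False: Arg becomes His
```

The first variant is a degenerate form of the reference, while the
second produces an amino acid substitution.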

It is also appreciated that such probes can be and
are preferably labeled with an analytically detectable
reagent to facilitate identification of the probe.
Useful reagents include but are not limited to
radioactivity, fluorescent dyes or enzymes capable of
catalyzing the formation of a detectable product. The
probes are thus useful to isolate complementary copies of
DNA from other animal sources or to screen such sources
for related sequences.

The coding sequence for the amidase enzyme of the
present invention was identified by preparing a
Thermococcus GU5L5 genomic DNA library and screening the
library for the clones having amidase activity. Such
methods for constructing a genomic gene library are
well-known in the art. One means, for example, comprises
shearing DNA isolated from GU5L5 by physical disruption.
A small amount of the sheared DNA is checked on an
agarose gel to verify that the majority of the DNA is in
the desired size range (approximately 3-6 kb). The DNA
is then blunt ended using Mung Bean Nuclease, incubated
at 37°C and phenol/chloroform extracted. The DNA is then
methylated using Eco RI Methylase. Eco RI linkers are
then ligated to the blunt ends through the use of T4 DNA
ligase and incubation at 4°C. The ligation reaction is
then terminated and the DNA is cut back with Eco RI
restriction enzyme. The DNA is then size fractionated on
a sucrose gradient following procedures known in the art,
for example, Maniatis, T., et al., Molecular Cloning,
Cold Spring Harbor Press, New York, 1982, which is hereby
incorporated by reference in its entirety.

A plate assay is then performed to get an
approximate concentration of the DNA. Ligation reactions
are then performed and 1 ul of the ligation reaction is
packaged to construct a library. Packaging, for example,
may occur through the use of purified λgt11 phage arms
cut with EcoRI and DNA cut with EcoRI after attaching
EcoRI linkers. The DNA and λgt11 arms are ligated with
DNA ligase. The ligated DNA is then packaged into
infectious phage particles. The packaged phages are used
to infect E. coli cultures and the infected cells are
spread on agar plates to yield plates carrying thousands
of individual phage plaques. The library is then
amplified.

Fragments of the full length gene of the present
invention may be used as a hybridization probe for a cDNA
or a genomic library to isolate the full length DNA and
to isolate other DNAs which have a high sequence
similarity to the gene or similar biological activity.
Probes of this type have at least 10, preferably at least
15, and even more preferably at least 30 bases and may
contain, for example, at least 50 or more bases. The
probe may also be used to identify a DNA clone
corresponding to a full length transcript and a genomic
clone or clones that contain the complete gene including
regulatory and promoter regions, exons, and introns.

The isolated nucleic acid sequences and other
enzymes may then be measured for retention of biological
activity characteristic to the enzyme of the present
invention, for example, in an assay for detecting
enzymatic amidase activity. Such enzymes include
truncated forms of amidase, and variants such as deletion
and insertion variants.

The polynucleotide of the present invention may be
in the form of DNA, which DNA includes cDNA, genomic DNA,
and synthetic DNA. The DNA may be double-stranded or
single-stranded, and if single stranded may be the coding
strand or non-coding (anti-sense) strand. The coding
sequence which encodes the mature enzyme may be identical
to the coding sequence shown in Figure 1 (SEQ ID NO:1)
and/or that of the deposited clone or may be a different
coding sequence which coding sequence, as a result of the
redundancy or degeneracy of the genetic code, encodes the
same mature enzyme as the DNA of Figure 1 (SEQ ID NO:1).
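
The relationship between the coding strand and the non-coding
(anti-sense) strand mentioned above can be sketched as a reverse
complement: each base is replaced by its pairing partner and the result
is read in the opposite direction. The function name and example
sequence are hypothetical.

```python
# Watson-Crick pairing: A<->T and G<->C, preserving letter case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(strand):
    """Return the complementary strand, written 5' to 3'."""
    if set(strand.upper()) - set("ACGT"):
        raise ValueError("strand must contain only A, C, G, T")
    return strand.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    coding = "ATGC"  # hypothetical coding-strand fragment
    print(reverse_complement(coding))  # prints GCAT
```

Applying the function twice returns the original strand, which is a
convenient sanity check when handling single-stranded sequences.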

The polynucleotide which encodes for the mature
enzyme of Figure 1 (SEQ ID NO:2) may include, but is not
limited to: only the coding sequence for the mature
enzyme; the coding sequence for the mature enzyme and
additional coding sequence such as a leader sequence or a
proprotein sequence; the coding sequence for the mature
enzyme (and optionally additional coding sequence) and
non-coding sequence, such as introns or non-coding
sequence 5' and/or 3' of the coding sequence for the
mature enzyme.

Thus, the term "polynucleotide encoding an enzyme
(protein)" encompasses a polynucleotide which includes
only coding sequence for the enzyme as well as a
polynucleotide which includes additional coding and/or
non-coding sequence.

The present invention further relates to variants
of the hereinabove described polynucleotides which encode
for fragments, analogs and derivatives of the enzyme
having the deduced amino acid sequence of Figure 1 (SEQ
ID NO:2). The variant of the polynucleotide may be a
naturally occurring allelic variant of the polynucleotide
or a non-naturally occurring variant of the
polynucleotide.

Thus, the present invention includes
polynucleotides encoding the same mature enzyme as shown
in Figure 1 (SEQ ID NO:2) as well as variants of such
polynucleotides which variants encode for a fragment,
derivative or analog of the enzyme of Figure 1 (SEQ ID
NO:2). Such nucleotide variants include deletion
variants, substitution variants and addition or insertion
variants.

As hereinabove indicated, the polynucleotide may
have a coding sequence which is a naturally occurring
allelic variant of the coding sequence shown in Figure 1
(SEQ ID NO:1). As known in the art, an allelic variant
is an alternate form of a polynucleotide sequence which
may have a substitution, deletion or addition of one or
more nucleotides, which does not substantially alter the
function of the encoded enzyme.

The present invention also includes
polynucleotides wherein the coding sequence for the
mature enzyme may be fused in the same reading frame to a
polynucleotide sequence which aids in expression and
secretion of an enzyme from a host cell, for example, a
leader sequence which functions to control transport of
an enzyme from the cell. The enzyme having a leader
sequence is a preprotein and may have the leader sequence
cleaved by the host cell to form the mature form of the
enzyme. The polynucleotides may also encode for a
proprotein which is the mature protein plus additional 5'
amino acid residues. A mature protein having a
prosequence is a proprotein and is an inactive form of
the protein. Once the prosequence is cleaved an active
mature protein remains.

Thus, for example, the polynucleotide of the
present invention may encode for a mature enzyme, or for
an enzyme having a prosequence or for an enzyme having
both a prosequence and a presequence (leader sequence).

The present invention further relates to
polynucleotides which hybridize to the
hereinabove-described sequences if there is at least 70%,
preferably at least 90%, and more preferably at least 95%
identity between the sequences. The present invention
particularly relates to polynucleotides which hybridize
under stringent conditions to the hereinabove-described
polynucleotides. As herein used, the term "stringent
conditions" means hybridization will occur only if there
is at least 95% and preferably at least 97% identity
between the sequences. The polynucleotides which
hybridize to the hereinabove described polynucleotides in
a preferred embodiment encode enzymes which retain
substantially the same biological function or activity as
the mature enzyme encoded by the DNA of Figure 1 (SEQ ID
NO:1).

Alternatively, the polynucleotide may have at
least 15 bases, preferably at least 30 bases, and more
preferably at least 50 bases which hybridize to a
polynucleotide of the present invention and which has an
identity thereto, as hereinabove described, and which may
or may not retain activity. For example, such
polynucleotides may be employed as probes for the
polynucleotide of SEQ ID NO:1, for example, for recovery
of the polynucleotide or as a PCR primer.

Thus, the present invention is directed to
polynucleotides having at least a 70% identity,
preferably at least 90% identity and more preferably at
least a 95% identity to a polynucleotide which encodes
the enzyme of SEQ ID NO:2 as well as fragments thereof,
which fragments have at least 30 bases and preferably at
least 50 bases and to enzymes encoded by such
polynucleotides.

The present invention further relates to an enzyme
which has the deduced amino acid sequence of Figure 1
(SEQ ID NO:2), as well as fragments, analogs and
derivatives of such enzyme.

The terms "fragment," "derivative" and "analog"
when referring to the enzyme of Figure 1 (SEQ ID NO:2)
mean an enzyme which retains essentially the same
biological function or activity as such enzyme. Thus, an
analog includes a proprotein which can be activated by
cleavage of the proprotein portion to produce an active
mature enzyme.

The enzyme of the present invention may be a
recombinant enzyme, a natural enzyme or a synthetic
enzyme, preferably a recombinant enzyme.

The fragment, derivative or analog of the enzyme
of Figure 1 (SEQ ID NO:2) may be (i) one in which one or
more of the amino acid residues are substituted with a
conserved or non-conserved amino acid residue (preferably
a conserved amino acid residue) and such substituted
amino acid residue may or may not be one encoded by the
genetic code, or (ii) one in which one or more of the
amino acid residues includes a substituent group, or
(iii) one in which the mature enzyme is fused with
another compound, such as a compound to increase the
half-life of the enzyme (for example, polyethylene
glycol), or (iv) one in which the additional amino acids
are fused to the mature enzyme, such as a leader or
secretory sequence or a sequence which is employed for
purification of the mature enzyme or a proprotein
sequence. Such fragments, derivatives and analogs are
deemed to be within the scope of those skilled in the art
from the teachings herein.

The enzymes and polynucleotides of the present
invention are preferably provided in an isolated form,
and preferably are purified to homogeneity.



The term "isolated" means that the material is
removed from its original environment (e.g., the natural
environment if it is naturally occurring). For example,
a naturally-occurring polynucleotide or enzyme present in
a living animal is not isolated, but the same
polynucleotide or enzyme, separated from some or all of
the coexisting materials in the natural system, is
isolated. Such polynucleotides could be part of a vector
and/or such polynucleotides or enzymes could be part of a
composition, and still be isolated in that such vector or
composition is not part of its natural environment.

The enzymes of the present invention include the
enzyme of SEQ ID NO:2 (in particular the mature enzyme)
as well as enzymes which have at least 70% similarity
(preferably at least 70% identity) to the enzyme of SEQ
ID NO:2 and more preferably at least 90% similarity (more
preferably at least 90% identity) to the enzyme of SEQ ID
NO:2 and still more preferably at least 95% similarity
(still more preferably at least 95% identity) to the
enzyme of SEQ ID NO:2 and also include portions of such
enzymes with such portion of the enzyme generally
containing at least 30 amino acids and more preferably at
least 50 amino acids.



As known in the art, "similarity" between two
enzymes is determined by comparing the amino acid
sequence and its conserved amino acid substitutes of one
enzyme to the sequence of a second enzyme. Similarity
may be determined by procedures which are well-known in
the art, for example, a BLAST program (Basic Local
Alignment Search Tool at the National Center for
Biotechnology Information).

A variant, i.e., a "fragment", "analog" or
"derivative" enzyme, and reference enzyme may differ in
amino acid sequence by one or more substitutions,
additions, deletions, fusions and truncations, which may
be present in any combination.

Among preferred variants are those that vary from
a reference by conservative amino acid substitutions.
Such substitutions are those that substitute a given
amino acid in a polypeptide by another amino acid of like
characteristics. Typically seen as conservative
substitutions are the replacements, one for another,
among the aliphatic amino acids Ala, Val, Leu and Ile;
interchange of the hydroxyl residues Ser and Thr,
exchange of the acidic residues Asp and Glu, substitution
between the amide residues Asn and Gln, exchange of the
basic residues Lys and Arg and replacements among the
aromatic residues Phe, Tyr.
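
The groupings listed above can be sketched as a simple lookup that
reports whether replacing one residue with another counts as
conservative. Only the six groups named in the text are encoded; the
one-letter codes, the function name and the choice to treat an unchanged
residue as "not a substitution" are illustrative assumptions.

```python
# The six groups named in the text, in one-letter amino acid codes.
CONSERVATIVE_GROUPS = (
    frozenset("AVLI"),  # aliphatic: Ala, Val, Leu, Ile
    frozenset("ST"),    # hydroxyl: Ser, Thr
    frozenset("DE"),    # acidic: Asp, Glu
    frozenset("NQ"),    # amide: Asn, Gln
    frozenset("KR"),    # basic: Lys, Arg
    frozenset("FY"),    # aromatic: Phe, Tyr
)


def is_conservative_substitution(original, replacement):
    """True if the two residues differ but belong to the same listed group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # no substitution took place
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


if __name__ == "__main__":
    print(is_conservative_substitution("L", "I"))  # True: both aliphatic
    print(is_conservative_substitution("K", "D"))  # False: basic vs. acidic
```

Residues that appear in none of the listed groups, such as Gly or Pro,
are never reported as conservative by this sketch.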

Most highly preferred are variants which retain
the same biological function and activity as the
reference polypeptide from which they vary.

Fragments or portions of the enzymes of the
present invention may be employed for producing the
corresponding full-length enzyme by peptide synthesis;
therefore, the fragments may be employed as intermediates
for producing the full-length enzymes. Fragments or
portions of the polynucleotides of the present invention
may be used to synthesize full-length polynucleotides of
the present invention.

The present invention also relates to vectors
which include polynucleotides of the present invention,
host cells which are genetically engineered with vectors
of the invention and the production of enzymes of the
invention by recombinant techniques.

Host cells are genetically engineered (transduced
or transformed or transfected) with the vectors
containing the polynucleotides of this invention. Such
vectors may be, for example, a cloning vector or an
expression vector. The vector may be, for example, in
the form of a plasmid, a viral particle, a phage, etc.
The engineered host cells can be cultured in conventional
nutrient media modified as appropriate for activating
promoters, selecting transformants or amplifying the
genes of the present invention. The culture conditions,
such as temperature, pH and the like, are those
previously used with the host cell selected for
expression, and will be apparent to the ordinarily
skilled artisan.

The polynucleotides of the present invention may
be employed for producing enzymes by recombinant
techniques. Thus, for example, the polynucleotide may be
included in any one of a variety of expression vectors
for expressing an enzyme. Such vectors include
chromosomal, nonchromosomal and synthetic DNA sequences,
e.g., derivatives of SV40; bacterial plasmids; phage DNA;
baculovirus; yeast plasmids; vectors derived from
combinations of plasmids and phage DNA, viral DNA such as
vaccinia, adenovirus, fowl pox virus, and pseudorabies.
However, any other vector may be used as long as it is
replicable and viable in the host.

The appropriate DNA sequence may be inserted into
the vector by a variety of procedures. In general, the
DNA sequence is inserted into an appropriate restriction
endonuclease site(s) by procedures known in the art.
Such procedures and others are deemed to be within the
scope of those skilled in the art.

The DNA sequence in the expression vector is
operatively linked to an appropriate expression control
sequence(s) (promoter) to direct mRNA synthesis. As
representative examples of such promoters, there may be
mentioned: LTR or SV40 promoter, the E. coli lac or trp,
the phage lambda promoter and other promoters known to
control expression of genes in prokaryotic or eukaryotic
cells or their viruses. The expression vector also
contains a ribosome binding site for translation
initiation and a transcription terminator. The vector
may also include appropriate sequences for amplifying
expression.

In addition, the expression vectors preferably
contain one or more selectable marker genes to provide a
phenotypic trait for selection of transformed host cells
such as dihydrofolate reductase or neomycin resistance
for eukaryotic cell culture, or such as tetracycline or
ampicillin resistance in E. coli.

The vector containing the appropriate DNA sequence
as hereinabove described, as well as an appropriate
promoter or control sequence, may be employed to
transform an appropriate host to permit the host to
express the protein.

As representative examples of appropriate hosts,
there may be mentioned: bacterial cells, such as E. coli,
Streptomyces, Bacillus subtilis; fungal cells, such as
yeast; insect cells such as Drosophila S2 and Spodoptera
Sf9; animal cells such as CHO, COS or Bowes melanoma;
adenoviruses; plant cells, etc. The selection of an
appropriate host is deemed to be within the scope of
those skilled in the art from the teachings herein.

More particularly, the present invention also
includes recombinant constructs comprising one or more of
the sequences as broadly described above. The constructs
comprise a vector, such as a plasmid or viral vector,
into which a sequence of the invention has been inserted,
in a forward or reverse orientation. In a preferred
aspect of this embodiment, the construct further
comprises regulatory sequences, including, for example, a
promoter, operably linked to the sequence. Large numbers
of suitable vectors and promoters are known to those of
skill in the art, and are commercially available. The
following vectors are provided by way of example:
Bacterial: pQE70, pQE60, pQE-9 (Qiagen), pBluescript II
(Stratagene); pTRC99a, pKK223-3, pDR540, pRIT2T
(Pharmacia); Eukaryotic: pXT1, pSG5 (Stratagene) pSVK3,
pBPV, pMSG, pSVLSV40 (Pharmacia). However, any other
plasmid or vector may be used as long as it is
replicable and viable in the host.




Promoter regions can be selected from any desired
gene using CAT (chloramphenicol transferase) vectors or
other vectors with selectable markers. Two appropriate
vectors are pKK232-8 and pCM7. Particular named
bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda P_R, P_L, and trp. Eukaryotic promoters include CMV
immediate early, HSV thymidine kinase, early and late
SV40, LTRs from retrovirus, and mouse metallothionein-I.
Selection of the appropriate vector and promoter is well
within the level of ordinary skill in the art.

In a further embodiment, the present invention
relates to host cells containing the above-described
constructs. The host cell can be a higher eukaryotic
cell, such as a mammalian cell, or a lower eukaryotic
cell, such as a yeast cell, or the host cell can be a
prokaryotic cell, such as a bacterial cell. Introduction
of the construct into the host cell can be effected by
calcium phosphate transfection, DEAE-Dextran mediated
transfection, or electroporation (Davis, L., Dibner, M.,
Battey, I., Basic Methods in Molecular Biology, (1986)).

The constructs in host cells can be used in a
conventional manner to produce the gene product encoded
by the recombinant sequence. Alternatively, the enzymes
of the invention can be synthetically produced by
conventional peptide synthesizers.

Mature proteins can be expressed in mammalian
cells, yeast, bacteria, or other cells under the control
of appropriate promoters. Cell-free translation systems
can also be employed to produce such proteins using RNAs
derived from the DNA constructs of the present invention.




Appropriate cloning and expression vectors for use with
prokaryotic and eukaryotic hosts are described by
Sambrook et al., Molecular Cloning: A Laboratory Manual,
Second Edition, Cold Spring Harbor, N.Y., (1989), the
disclosure of which is hereby incorporated by reference.

Transcription of the DNA encoding the enzymes of
the present invention by higher eukaryotes is increased
by inserting an enhancer sequence into the vector.
Enhancers are cis-acting elements of DNA, usually from
about 10 to 300 bp, that act on a promoter to increase its
transcription. Examples include the SV40 enhancer on the
late side of the replication origin bp 100 to 270, a
cytomegalovirus early promoter enhancer, the polyoma
enhancer on the late side of the replication origin, and
adenovirus enhancers.



Generally, recombinant expression vectors will
include origins of replication and selectable markers
permitting transformation of the host cell, e.g., the
ampicillin resistance gene of E. coli and S. cerevisiae
TRP1 gene, and a promoter derived from a highly-expressed
gene to direct transcription of a downstream structural
sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate
kinase (PGK), α-factor, acid phosphatase, or heat shock
proteins, among others. The heterologous structural
sequence is assembled in appropriate phase with
translation initiation and termination sequences, and
preferably, a leader sequence capable of directing
secretion of translated enzyme. Optionally, the
heterologous sequence can encode a fusion enzyme
including an N-terminal identification peptide imparting
desired characteristics, e.g., stabilization or
simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are
constructed by inserting a structural DNA sequence
encoding a desired protein together with suitable
translation initiation and termination signals in
operable reading phase with a functional promoter. The
vector will comprise one or more phenotypic selectable
markers and an origin of replication to ensure
maintenance of the vector and to, if desirable, provide
amplification within the host. Suitable prokaryotic
hosts for transformation include E. coli, Bacillus
subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and
Staphylococcus, although others may also be employed as a
matter of choice.

As a representative but nonlimiting example,
useful expression vectors for bacterial use can comprise
a selectable marker and bacterial origin of replication
derived from commercially available plasmids comprising
genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for
example, pKK223-3 (Pharmacia Fine Chemicals, Uppsala,
Sweden) and GEM1 (Promega Biotec, Madison, WI, USA).
These pBR322 "backbone" sections are combined with an
appropriate promoter and the structural sequence to be
expressed.

Following transformation of a suitable host strain
and growth of the host strain to an appropriate cell
density, the selected promoter is induced by appropriate
means (e.g., temperature shift or chemical induction) and
cells are cultured for an additional period.

Cells are typically harvested by centrifugation,
disrupted by physical or chemical means, and the
resulting crude extract retained for further
purification.

Microbial cells employed in expression of proteins
can be disrupted by any convenient method, including
freeze-thaw cycling, sonication, mechanical disruption,
or use of cell lysing agents; such methods are well known
to those skilled in the art.

Various mammalian cell culture systems can also be
employed to express recombinant protein. Examples of
mammalian expression systems include the COS-7 lines of
monkey kidney fibroblasts, described by Gluzman, Cell,
23:175 (1981), and other cell lines capable of expressing
a compatible vector, for example, the C127, 3T3, CHO,
HeLa and BHK cell lines. Mammalian expression vectors
will comprise an origin of replication, a suitable
promoter and enhancer, and also any necessary ribosome
binding sites, polyadenylation site, splice donor and
acceptor sites, transcriptional termination sequences,
and 5' flanking nontranscribed sequences. DNA sequences
derived from the SV40 splice and polyadenylation sites
may be used to provide the required nontranscribed
genetic elements.

The enzyme can be recovered and purified from
recombinant cell cultures by methods including ammonium
sulfate or ethanol precipitation, acid extraction, anion
or cation exchange chromatography, phosphocellulose
chromatography, hydrophobic interaction chromatography,
affinity chromatography, hydroxylapatite chromatography
and lectin chromatography. Protein refolding steps can
be used, as necessary, in completing configuration of the
mature protein. Finally, high performance liquid
chromatography (HPLC) can be employed for final
purification steps.

The enzymes of the present invention may be a
naturally purified product, or a product of chemical
synthetic procedures, or produced by recombinant
techniques from a prokaryotic or eukaryotic host (for
example, by bacterial, yeast, higher plant, insect and
mammalian cells in culture). Depending upon the host
employed in a recombinant production procedure, the
enzymes of the present invention may be glycosylated or
may be non-glycosylated. Enzymes of the invention may or
may not also include an initial methionine amino acid
residue.

The enzymes, their fragments or other derivatives, 
or analogs thereof, or cells expressing them can be used 
as an immunogen to produce antibodies thereto. These 
antibodies can be, for example, polyclonal or monoclonal 
antibodies. The present invention also includes 
chimeric, single chain, and humanized antibodies, as well 
as Fab fragments, or the product of an Fab expression 
library. Various procedures known in the art may be used 
for the production of such antibodies and fragments. 

Antibodies generated against the enzymes
corresponding to a sequence of the present invention can
be obtained by direct injection of the enzymes into an
animal or by administering the enzymes to an animal,
preferably a nonhuman. The antibody so obtained will
then bind the enzymes themselves. In this manner, even a
sequence encoding only a fragment of the enzymes can be
used to generate antibodies binding the whole native
enzymes. Such antibodies can then be used to isolate the
enzyme from cells expressing that enzyme.

For preparation of monoclonal antibodies, any
technique which provides antibodies produced by
continuous cell line cultures can be used. Examples
include the hybridoma technique (Kohler and Milstein,
1975, Nature, 256:495-497), the trioma technique, the
human B-cell hybridoma technique (Kozbor et al., 1983,
Immunology Today 4:72), and the EBV-hybridoma technique
to produce human monoclonal antibodies (Cole, et al.,
1985, in Monoclonal Antibodies and Cancer Therapy, Alan
R. Liss, Inc., pp. 77-96).



Techniques described for the production of single
chain antibodies (U.S. Patent 4,946,778) can be adapted
to produce single chain antibodies to immunogenic enzyme
products of this invention. Also, transgenic mice may be
used to express humanized antibodies to immunogenic
enzyme products of this invention.



Antibodies generated against the enzyme of the
present invention may be used in screening for similar
enzymes from other organisms and samples. Such screening
techniques are known in the art; for example, one such
screening assay is described in "Methods for Measuring
Cellulase Activities", Methods in Enzymology, Vol. 160,
pp. 87-116, which is hereby incorporated by reference in
its entirety. Antibodies may also be employed as a probe
to screen gene libraries generated from this or other
organisms to identify this or cross reactive activities.

The term "antibody," as used herein, refers to
intact immunoglobulin molecules, as well as fragments of
immunoglobulin molecules, such as Fab, Fab', (Fab')2, Fv,
and SCA fragments, that are capable of binding to an
epitope of an amidase polypeptide. These antibody
fragments, which retain some ability to selectively bind
to the antigen (e.g., an amidase antigen) of the antibody
from which they are derived, can be made using well known
methods in the art (see, e.g., Harlow and Lane, supra),
and are described further, as follows.

(1) A Fab fragment consists of a monovalent antigen-
binding fragment of an antibody molecule, and can be
produced by digestion of a whole antibody molecule with
the enzyme papain, to yield a fragment consisting of an
intact light chain and a portion of a heavy chain.

(2) A Fab' fragment of an antibody molecule can be
obtained by treating a whole antibody molecule with
pepsin, followed by reduction, to yield a molecule
consisting of an intact light chain and a portion of a
heavy chain. Two Fab' fragments are obtained per
antibody molecule treated in this manner.

(3) A (Fab')2 fragment of an antibody can be obtained by
treating a whole antibody molecule with the enzyme
pepsin, without subsequent reduction. A (Fab')2 fragment
is a dimer of two Fab' fragments, held together by two
disulfide bonds.

(4) An Fv fragment is defined as a genetically engineered
fragment containing the variable region of a light chain
and the variable region of a heavy chain expressed as two
chains.

(5) A single chain antibody ("SCA") is a genetically
engineered single chain molecule containing the variable
region of a light chain and the variable region of a
heavy chain, linked by a suitable, flexible polypeptide
linker.

As used in this invention, the term "epitope"
refers to an antigenic determinant on an antigen, such as
an amidase polypeptide, to which the paratope of an
antibody, such as an amidase-specific antibody, binds.
Antigenic determinants usually consist of chemically
active surface groupings of molecules, such as amino
acids or sugar side chains, and can have specific three-
dimensional structural characteristics, as well as
specific charge characteristics.

The present invention is further described with
reference to the following examples; however, it is to be
understood that the present invention is not limited to
such examples. All parts or amounts, unless otherwise
specified, are by weight.

In order to facilitate understanding of the
following examples, certain frequently occurring methods
and/or terms will be described.



"Plasmids" are designated by a lower case p 
preceded and/or followed by capital letters and/or
numbers. The starting plasmids herein are either
commercially available, publicly available on an
unrestricted basis, or can be constructed from available
plasmids in accord with published procedures. In
addition, equivalent plasmids to those described are
known in the art and will be apparent to the ordinarily
skilled artisan.

"Digestion" of DNA refers to catalytic cleavage of 
the DNA with a restriction enzyme that acts only at
certain sequences in the DNA. The various restriction
enzymes used herein are commercially available and their
reaction conditions, cofactors and other requirements
were used as would be known to the ordinarily skilled
artisan. For analytical purposes, typically 1 µg of
plasmid or DNA fragment is used with about 2 units of
enzyme in about 20 µl of buffer solution. For the
purpose of isolating DNA fragments for plasmid
construction, typically 5 to 50 µg of DNA are digested
with 20 to 250 units of enzyme in a larger volume.
Appropriate buffers and substrate amounts for particular
restriction enzymes are specified by the manufacturer.
Incubation times of about 1 hour at 37°C are ordinarily
used, but may vary in accordance with the supplier's
instructions. After digestion the reaction is
electrophoresed directly on a polyacrylamide gel to
isolate the desired fragment.




Size separation of the cleaved fragments is
performed using 8 percent polyacrylamide gel described by
Goeddel, D. et al., Nucleic Acids Res., 8:4057 (1980).



"Oligonucleotides" refers to either a single 
stranded polydeoxynucleotide or two complementary
polydeoxynucleotide strands which may be chemically
synthesized. Such synthetic oligonucleotides may or may
not have a 5' phosphate. Those that do not will not
ligate to another oligonucleotide without adding a
phosphate with an ATP in the presence of a kinase. A
synthetic oligonucleotide will ligate to a fragment that
has not been dephosphorylated.

"Ligation" refers to the process of forming 
phosphodiester bonds between two double stranded nucleic
acid fragments (Maniatis et al., Id., p. 146). Unless
otherwise provided, ligation may be accomplished using
known buffers and conditions with 10 units of T4 DNA
ligase ("ligase") per 0.5 µg of approximately equimolar
amounts of the DNA fragments to be ligated.
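
The "approximately equimolar" condition can be turned into masses by scaling with fragment length, since the mass per mole of double stranded DNA is roughly proportional to its length in base pairs. The following is a minimal sketch under that assumption; the function names and the example fragment sizes (a 4000 bp vector and a 1000 bp insert) are hypothetical and are not taken from the text. Only the figure of 10 units of ligase per 0.5 µg of DNA comes from the definition above.

```python
def equimolar_insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio=1.0):
    """Mass of insert (ng) giving `molar_ratio` moles of insert per mole of vector.

    Assumes mass per mole of double stranded DNA scales linearly with length.
    """
    if vector_bp <= 0 or insert_bp <= 0:
        raise ValueError("fragment lengths must be positive")
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


def ligase_units(total_dna_ug, units_per_half_ug=10.0):
    """T4 DNA ligase units for a given total DNA mass (10 units per 0.5 ug)."""
    return units_per_half_ug * total_dna_ug / 0.5


if __name__ == "__main__":
    # Hypothetical example: 400 ng of a 4000 bp vector and a 1000 bp insert.
    insert_ng = equimolar_insert_mass_ng(400, 4000, 1000)
    total_ug = (400 + insert_ng) / 1000.0
    print(f"insert: {insert_ng:.0f} ng, ligase: {ligase_units(total_ug):.0f} units")
```

Under these assumed sizes the two fragments together weigh 0.5 µg, which corresponds to the 10 units of ligase quoted in the definition.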

Unless otherwise stated, transformation was
performed as described in the method of Sambrook, Fritsch
and Maniatis, 1989.



Example 1 

Bacterial Expression and Purification of Amidase 

A Thermococcus GU5L5 genomic library was screened
for amidase activity as described in Example 2 and a
positive clone was identified and isolated. DNA of this
clone was used as a template in a 100 µl PCR reaction
using the following primer sequences:

5' primer: 5' CCGAGAATTC ATTAAAGAGG AGAAATTAAC TATGACCGGC
ATCGAATGGA 3' (SEQ ID NO:3)
3' primer: 5' AATAAGGATC CACACTGGCA CAGTGTCAAG ACA 3'
(SEQ ID NO:4).

The protein was expressed in E. coli. The gene
was amplified using PCR with the primers indicated above.
Subsequent to amplification, the PCR product was
cloned into the EcoRI and BamHI sites of pQET1 and
transformed by electroporation into E. coli M15(pREP4).
The resulting transformants were grown up in 3 ml
cultures, and a portion of this culture was induced.
Portions of the uninduced and induced cultures were
assayed using Z-L-Phe-AMC (see below).
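
As a quick sanity check on the cloning strategy, the primers can be scanned for the EcoRI (GAATTC) and BamHI (GGATCC) recognition sequences, and the 5' primer can be compared with the start of the coding sequence in SEQ ID NO:1. The sketch below is illustrative only: the primer strings assume that the two illegible positions in the printed primers are both A, and the helper name `find_sites` is hypothetical.

```python
# Primer sequences as read from Example 1 (spaces removed). The two positions
# that are illegible in the printed text are ASSUMED here to be "A".
FWD_PRIMER = "CCGAGAATTCATTAAAGAGGAGAAATTAACTATGACCGGCATCGAATGGA"
REV_PRIMER = "AATAAGGATCCACACTGGCACAGTGTCAAGACA"

# Recognition sequences of the two restriction enzymes named in the text.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}

# First 19 nucleotides of the coding sequence given in SEQ ID NO:1.
ORF_START = "ATGACCGGCATCGAATGGA"


def find_sites(seq):
    """Return the 0-based index of each recognition site in seq (-1 if absent)."""
    return {name: seq.find(site) for name, site in SITES.items()}


if __name__ == "__main__":
    print("5' primer sites:", find_sites(FWD_PRIMER))
    print("3' primer sites:", find_sites(REV_PRIMER))
    print("5' primer ends with ORF start:", FWD_PRIMER.endswith(ORF_START))
```

Read this way, the 5' primer carries the EcoRI site upstream of the ATG start codon and the 3' primer carries the BamHI site, which is consistent with cloning the product into the EcoRI and BamHI sites of the vector.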



The primer sequences set out above may also be
employed to isolate the target gene from the deposited
material by hybridization techniques described above.



Example 2 

Discovery of an amidase from Thermococcus GU5L5 

Production of the expression gene bank.

Colonies containing pBluescript plasmids with
random inserts from the organism Thermococcus GU5L5 were
obtained according to the method of Hay and Short (Hay,
B. and Short, J., Strategies, 1992, 5, 16). The
resulting colonies were picked with sterile toothpicks
and used to singly inoculate each of the wells of 96-well
microtiter plates. The wells contained 250 µL of LB
media with 100 µg/mL ampicillin, 80 µg/mL methicillin,
and 10% v/v glycerol (LB Amp/Meth, glycerol). The cells
were grown overnight at 37°C without shaking. This
constituted generation of the "Source GeneBank"; each well
of the Source GeneBank thus contained a stock culture of
E. coli cells, each of which contained a pBluescript
plasmid with a unique DNA insert.

Screening for amidase activity.

The plates of the Source GeneBank were used to
multiply inoculate a single plate (the "Condensed Plate")
containing in each well 200 µL of LB Amp/Meth, glycerol.
This step was performed using the High Density
Replicating Tool (HDRT) of the Beckman Biomek with a 1%
bleach, water, isopropanol, air-dry sterilization cycle
in between each inoculation. Each well of the Condensed
Plate thus contained 10 to 12 different pBluescript
clones from each of the source library plates. The
Condensed Plate was grown for 16 h at 37°C and then used
to inoculate two white 96-well Polyfiltronics microtiter
daughter plates containing in each well 250 µL of LB
Amp/Meth (without glycerol). The original condensed
plate was put in storage at -80°C. The two condensed
daughter plates were incubated at 37°C for 18 h.

The '600 µM substrate stock solution' was prepared
as follows: 25 mg of N-morphourea-L-phenylalanyl-7-
amido-4-trifluoromethylcoumarin (Mu-Phe-AFC, Enzyme
Systems Products, Dublin, CA) was dissolved in the
appropriate volume of DMSO to yield a 25.2 mM solution.
Two hundred fifty microliters of DMSO solution was added
to ca. 9 mL of 50 mM, pH 7.5 Hepes buffer containing 0.6
mg/mL of dodecyl maltoside. The volume was taken to 10.5
mL with the above Hepes buffer to yield a cloudy
solution.
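
The name of the stock solution follows from a simple dilution: 250 µL of the 25.2 mM DMSO solution brought to 10.5 mL gives 0.6 mM, i.e. 600 µM. A minimal sketch of that arithmetic is shown here; the function name `dilute` is hypothetical.

```python
def dilute(stock_conc, stock_volume, final_volume):
    """Concentration after dilution (C1 * V1 / V2).

    The two volumes must share one unit; the result has the unit of stock_conc.
    """
    if final_volume <= 0:
        raise ValueError("final volume must be positive")
    return stock_conc * stock_volume / final_volume


if __name__ == "__main__":
    dmso_stock_um = 25.2 * 1000          # 25.2 mM expressed in micromolar
    working_um = dilute(dmso_stock_um, 0.250, 10.5)   # volumes in mL
    print(f"substrate stock: {working_um:.0f} uM")
```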
Mu-Phe-AFC

Fifty µL of the '600 µM stock solution' was added
to each of the wells of a white condensed plate using the
Biomek to yield a final concentration of substrate of
~100 µM. The fluorescence values were recorded
(excitation = 400 nm, emission = 505 nm) on a plate
reading fluorometer immediately after addition of the
substrate. The plate was incubated at 70°C for 60 min,
and the fluorescence values were recorded again. The
initial and final fluorescence values were subtracted to
determine if an active clone was present by an increase
in fluorescence over the majority of the other wells.

Isolation of the active clone.

In order to isolate the individual clone which
carried the activity, the Source GeneBank plates were
thawed and the individual wells used to singly inoculate
a new plate containing LB Amp/Meth. As above, the plate
was incubated at 37°C to grow the cells, and 50 µL of 600
µM substrate stock solution added using the Biomek. Once
the active well from the source plate was identified, the
cells from the source plate were used to inoculate 3 mL
cultures of LB/Amp/Meth, which were grown overnight. The
plasmid DNA was isolated from the cultures and utilized
for sequencing and construction of expression subclones.

Example 3

Thermococcus GU5L5 Amidase characterization

Substrate specificity.

Using the following substrates (see below for
definitions of the abbreviations): CBZ-L-ala-AMC, CBZ-L-
arg-AMC, CBZ-L-met-AMC, CBZ-L-phe-AMC, and 7-methyl-
umbelliferyl heptanoate at 100 µM for 1 hour at 70°C in
the assays as described in the clone discovery section,
the relative activity of the amidase was 3:3:1:<0.1:<0.1
for the compounds CBZ-L-arg-AMC : CBZ-L-phe-AMC : CBZ-L-
met-AMC : CBZ-L-ala-AMC : 7-methylumbelliferyl
heptanoate. The excitation and emission wavelengths for
the 7-amido-4-methylcoumarins were 380 and 460 nm
respectively, and 326 and 450 nm for the
methylumbelliferone.

The abbreviations stand for the following
compounds:

CBZ-L-ala-AMC = Nα-carbonylbenzyloxy-L-alanine-7-
amido-4-methylcoumarin

CBZ-L-arg-AMC = Nα-carbonylbenzyloxy-L-arginine-7-
amido-4-methylcoumarin

CBZ-D-arg-AMC = Nα-carbonylbenzyloxy-D-arginine-7-
amido-4-methylcoumarin

CBZ-L-met-AMC = Nα-carbonylbenzyloxy-L-methionine-
7-amido-4-methylcoumarin

CBZ-L-phe-AMC = Nα-carbonylbenzyloxy-L-
phenylalanine-7-amido-4-methylcoumarin

Organic solvent sensitivity.

The activity of the amidase in increasing
concentrations of dimethyl sulfoxide (DMSO) was tested as
follows: to each well of a microtiter plate was added 10
µL of 3 mM CBZ-L-phe-AMC in DMSO, 25 µL of cell lysate
containing the amidase activity, and 250 µL of a variable
mixture of DMSO:pH 7.5, 50 mM Hepes buffer. The
reactions were heated for 1 hour at 70°C and the
fluorescence measured. Figure 2 shows the fluorescence
versus concentration of DMSO. The filled and open boxes
represent individual assays.

The activity and enantioselectivity of the amidase
in increasing concentrations of dimethyl formamide (DMF)
was tested as follows: to each well of a microtiter
plate was added 30 µL of 1 mM CBZ-L-arg-AMC or CBZ-D-arg-
AMC in DMF, 30 µL of cell lysate containing the amidase
activity, and 240 µL of a variable mixture of DMF:pH 7.5,
50 mM Hepes buffer. The reactions were incubated at RT
for 1 hour and the fluorescence measured at 1 minute
intervals. Figure 3 shows the relative initial linear
rates (increase in fluorescence per min, i.e.,
'activity') versus concentration of DMF for the more
reactive CBZ-L-arg-AMC.
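
An initial linear rate of the kind described (increase in fluorescence per minute) can be estimated by fitting a straight line to the early readings of a time course. The text does not say how the rates were computed, so the ordinary least-squares fit below, the choice of the first ten readings, and the example data are all assumptions made for illustration.

```python
def initial_rate(times_min, fluorescence, n_points=10):
    """Least-squares slope (fluorescence units per min) over the first n_points readings."""
    t = list(times_min[:n_points])
    f = list(fluorescence[:n_points])
    n = len(t)
    if n < 2 or len(f) != n:
        raise ValueError("need at least two paired readings")
    t_mean = sum(t) / n
    f_mean = sum(f) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((ti - t_mean) * (fi - f_mean) for ti, fi in zip(t, f))
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical time course read at 1 minute intervals.
    minutes = [0, 1, 2, 3, 4, 5]
    readings = [100, 150, 200, 250, 300, 350]
    print(f"initial rate: {initial_rate(minutes, readings):.1f} Fl.U./min")
```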

The initial linear rates ('activity') of the L and
the D CBZ-arg-AMC substrates are shown in Tables 1 and 2
below:



Table 1
Activity of the CBZ-L-arg-AMC

DMF      Initial Rate, Fl.U./min
0.4%     654
10%      2548
20%      1451
30%      541
40%      345
50%      303
60%      190
75%      81
90%      11

Table 2
Activity of the CBZ-D-arg-AMC

DMF      Initial Rate, Fl.U./min
0.4%     0.3
10%      10.1
20%      4.6
30%      1.8
40%      0.9
50%      1.2
60%      1.4
75%      0^
90%      0.1



The above data indicate that the enzyme shows
excellent selectivity for the L, or 'natural', enantiomer
of the derivatized amino acid substrate.
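
The selectivity claim can be made concrete by dividing the initial rate for the L substrate by that for the D substrate at each DMF concentration. The sketch below uses the values of Tables 1 and 2 for 0.4% to 60% DMF; the rows for 75% and 90% DMF are left out because their entries are not fully legible in the source. The ratio of initial rates is used here only as a rough indicator of selectivity and is not a calculation given in the text.

```python
# Initial rates (Fl.U./min) from Tables 1 and 2, keyed by % DMF.
L_RATES = {0.4: 654, 10: 2548, 20: 1451, 30: 541, 40: 345, 50: 303, 60: 190}
D_RATES = {0.4: 0.3, 10: 10.1, 20: 4.6, 30: 1.8, 40: 0.9, 50: 1.2, 60: 1.4}


def selectivity_ratios(l_rates, d_rates):
    """Ratio of L-substrate rate to D-substrate rate at each shared DMF level."""
    return {dmf: l_rates[dmf] / d_rates[dmf] for dmf in l_rates if dmf in d_rates}


if __name__ == "__main__":
    ratios = selectivity_ratios(L_RATES, D_RATES)
    for dmf, ratio in sorted(ratios.items()):
        print(f"{dmf:>5}% DMF   L/D rate ratio = {ratio:,.0f}")
    best = max(L_RATES, key=L_RATES.get)
    print(f"highest L-substrate rate at {best}% DMF")
```

On these figures the L substrate is hydrolysed more than a hundred times faster than the D substrate at every listed DMF concentration, and the highest rate for the L substrate is observed at 10% DMF.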



Numerous modifications and variations of the
present invention are possible in light of the above
teachings and, therefore, within the scope of the
appended claims, the invention may be practiced otherwise
than as particularly described.

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 1869 NUCLEOTIDES
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



ATG ACC GGC ATC GAA TGG AAC CAC GAG ACC TTT TCT AAG TTC GCC TAC 48
Met Thr Gly Ile Glu Trp Asn His Glu Thr Phe Ser Lys Phe Ala Tyr
5 10 15

CTG GGC GAC CCG AGG ATA CGG GGA AAC TTA ATC GCG TAC ACC CTG ACG 96
Leu Gly Asp Pro Arg Ile Arg Gly Asn Leu Ile Ala Tyr Thr Leu Thr
20 25 30

AAG GCC AAC ATG AAG GAC AAC AAG TAC GAG AGC ACG GTT GTT GTT GAA 144
Lys Ala Asn Met Lys Asp Asn Lys Tyr Glu Ser Thr Val Val Val Glu
35 40 45

GAC CTT GAA ACG GGC TCA AGG CGC TTC ATC GAG AAC GCC TCA ATG CCG 192
Asp Leu Glu Thr Gly Ser Arg Arg Phe Ile Glu Asn Ala Ser Met Pro
50 55 60

AGG ATT TCG CCA GAC GGC AGA AAG CTC GCC TTC ACC TGC TTT AAC GAG 240
Arg Ile Ser Pro Asp Gly Arg Lys Leu Ala Phe Thr Cys Phe Asn Glu
65 70 75 80

GAG AAG AAG GAG ACC GAG ATA TGG GTG GCC GAT ATC CAG ACC CTG AGC 288
Glu Lys Lys Glu Thr Glu Ile Trp Val Ala Asp Ile Gln Thr Leu Ser
85 90 95

GCC AAG AAA GTC CTC TCA ACT AAA AAC GTC CGC TCG ATG CAG TGG AAC 336
Ala Lys Lys Val Leu Ser Thr Lys Asn Val Arg Ser Met Gln Trp Asn
100 105 110

GAC GAT TCA AGG AGA CTC TTA GTT GTC GGC TTC AAG AGG AGG GAC GAT 384
Asp Asp Ser Arg Arg Leu Leu Val Val Gly Phe Lys Arg Arg Asp Asp
115 120 125

GAG GAC TTC GTC TTT GAC GAC GAC GTC CCG GTC TGG TTC GAC AAT ATG 432
Glu Asp Phe Val Phe Asp Asp Asp Val Pro Val Trp Phe Asp Asn Met
130 135 140

GGA TTC TTT GAT GGA GAG AAG ACG ACG TTC TGG GTT CTT GAC ACT GAG 480
Gly Phe Phe Asp Gly Glu Lys Thr Thr Phe Trp Val Leu Asp Thr Glu
145 150 155 160

GCC GAG GAG ATA ATC GAG CAG TTC GAG AAG CCG AGG TTT TCG AGT GGC 528
Ala Glu Glu Ile Ile Glu Gln Phe Glu Lys Pro Arg Phe Ser Ser Gly
165 170 175

CTC TGG CAC GGC GAT GCG ATA GTT GTG AAC GTC CCG CAC CGC GAG GGG 576
Leu Trp His Gly Asp Ala Ile Val Val Asn Val Pro His Arg Glu Gly
180 185 190

AGC AAG CCT GCC CTG TTC AAG TTC TAC GAC ATA GTC CTA TGG AAG GAC 624
Ser Lys Pro Ala Leu Phe Lys Phe Tyr Asp Ile Val Leu Trp Lys Asp
195 200 205

GGG GAG GAA GAG AAG CTC TTC GAG AGG GTC TCC TTC GAG GCG GTT GAC 672
Gly Glu Glu Glu Lys Leu Phe Glu Arg Val Ser Phe Glu Ala Val Asp
210 215 220

TCC GAC GGA AAG AGA ATA CTC CTG AGG GGC AAG AAA AAA AAG CGG TTC 720
Ser Asp Gly Lys Arg Ile Leu Leu Arg Gly Lys Lys Lys Lys Arg Phe
225 230 235 240

ATC AGC GAG CAC GAC TGG CTG TAC CTC TGG GAC GGC GAG CTT AAA CCG 768
Ile Ser Glu His Asp Trp Leu Tyr Leu Trp Asp Gly Glu Leu Lys Pro
245 250 255

ATC TAC GAG GGC CCG CTC GAC GTC TGG GAA GCC AAG CTC ACG GAA GGA 816
Ile Tyr Glu Gly Pro Leu Asp Val Trp Glu Ala Lys Leu Thr Glu Gly
260 265 270

AAG GTC TAC TTC CTC ACT CCA GAT GCG GGC AGG GTA AAC CTC TGG CTC 864
Lys Val Tyr Phe Leu Thr Pro Asp Ala Gly Arg Val Asn Leu Trp Leu
275 280 285

TGG GAC GGG AAG GCC GAG CGT GTT GTT ACC GGC GAC CAC TGG ATT TAC 912
Trp Asp Gly Lys Ala Glu Arg Val Val Thr Gly Asp His Trp Ile Tyr
290 295 300

GGG CTT GAC GTC AGC GAT GGC AAA GCA TTG CTC CTC ATC ATG ACC GCC 960
Gly Leu Asp Val Ser Asp Gly Lys Ala Leu Leu Leu Ile Met Thr Ala
305 310 315 320

ACG AGG ATA GGC GAG CTC TAC CTC TAC GAC GGC GAG CTG AAA CAG GTC 1008
Thr Arg Ile Gly Glu Leu Tyr Leu Tyr Asp Gly Glu Leu Lys Gln Val
325 330 335

ACC GAA TAC AAC GGG CCG ATA TTC AGG AAG CTC AAG ACC TTC GAG CCG 1056
Thr Glu Tyr Asn Gly Pro Ile Phe Arg Lys Leu Lys Thr Phe Glu Pro
340 345 350

AGG CAC TTC CGC TTC AAG AGC AAA GAC CTC GAG ATA GAC GGC TGG TAC 1104
Arg His Phe Arg Phe Lys Ser Lys Asp Leu Glu Ile Asp Gly Trp Tyr
355 360 365

CTC AGG CCG GAG GTT AAA GAG GAG AAG GCC CCG GTG ATA GTC TTC GTC 1152
Leu Arg Pro Glu Val Lys Glu Glu Lys Ala Pro Val Ile Val Phe Val
370 375 380

CAC GGC GGG CCG AAG GGC ATG TAC GGA CAC CGC TTC GTC TAC GAG ATG 1200
His Gly Gly Pro Lys Gly Met Tyr Gly His Arg Phe Val Tyr Glu Met
385 390 395 400

CAG CTG ATG GCG AGC AAG GGC TAC TAC TGC TGC TTC GTG AAC CCG CGC 1248
Gln Leu Met Ala Ser Lys Gly Tyr Tyr Val Val Phe Val Asn Pro Arg
405 410 415

GGC AGC GAC GGC TAT AGC GAA GAC TTC GCG CTC CGC GTC CTG GAG AGG 1296
Gly Ser Asp Gly Tyr Ser Glu Asp Phe Ala Leu Arg Val Leu Glu Arg
420 425 430

ACT GGC TTG GAG GAC TTT GAG GAC ATA ATG AAC GGC ATC GAG GAG TTC 1344
Thr Gly Leu Glu Asp Phe Glu Asp Ile Met Asn Gly Ile Glu Glu Phe
435 440 445

TTC AAG CTC GAA CCG CAG GGC GAC AGG GAG CGC GTT GGA ATA ACG GGC 1392
Phe Lys Leu Glu Pro Gln Ala Asp Arg Glu Arg Val Gly Ile Thr Gly
450 455 460

ATA AGC TAC GGC GGC TTC ATG ACC AAC TGG GCC TTG AC? CAG AGC GAC 1440
Ile Ser Tyr Gly Gly Phe Met Thr Asn Trp Ala Leu Thr Gln Ser Asp
465 470 475 480

CTC TTC AAG GCA GGA ATA AGC GAG AAC GGC ATA AGC TAC TGG CTC ACC 1488
Leu Phe Lys Ala Gly Ile Ser Glu Asn Gly Ile Ser Tyr Trp Leu Thr
485 490 495

AGC TAC GCC TTC TCG GAC ATA GGG CTC TGG TAC GAC GTC GAG GTC ATC 1536
Ser Tyr Ala Phe Ser Asp Ile Gly Leu Trp Tyr Asp Val Glu Val Ile
500 505 510

GGG CCA AAT CCG TTA GAG AAC GAG AAC TTC AGG AAG CTC AGC CCG CTG 1584
Gly Pro Asn Pro Leu Glu Asn Glu Asn Phe Arg Lys Leu Ser Pro Leu
515 520 525

TTC TAC GCT CAG AAC GTG AAG GCG CCG ATA CTC CTA ATC CAC TCG CTT 1632
Phe Tyr Ala Gln Asn Val Lys Ala Pro Ile Leu Leu Ile His Ser Leu
530 535 540

GAG GAC TAC CGC TGT CCG CTC GAC CAG AGC CTT ATG TTC TAC AAC GTG 1680
Glu Asp Tyr Arg Cys Pro Leu Asp Gln Ser Leu Met Phe Tyr Asn Val
545 550 555 560

CTC AAG GAC ATG GGC AAG GAA GCC TAC ATA GCG ATA TTC AAG CGC GGC 1728
Leu Lys Asp Met Gly Lys Glu Ala Tyr Ile Ala Ile Phe Lys Arg Gly
565 570 575

GCG CAC GGC CAC AGC GTC CGC GGA AGC CCG AGG CAC AGG CCG AAG CGC 1776
Ala His Gly His Ser Val Arg Gly Ser Pro Arg His Arg Pro Lys Arg
580 585 590

TAC AGG CTC TTC ATA GAG TTC TTC GAG CGC AAG CTC AAG AAG TAC GAG 1824
Tyr Arg Leu Phe Ile Glu Phe Phe Glu Arg Lys Leu Lys Lys Tyr Glu
595 600 605

GAG GGC TTT GAG GTA GAG AAG ATA CTC AAG GGG AAT GGG AAC TGA 1869
Glu Gly Phe Glu Val Glu Lys Ile Leu Lys Gly Asn Gly Asn
610 615 620


(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 622 AMINO ACIDS
(B) TYPE: AMINO ACID
(C) STRANDEDNESS:
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: PROTEIN

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Thr Gly Ile Glu Trp Asn His Glu Thr Phe Ser Lys Phe Ala Tyr
5 10 15

Leu Gly Asp Pro Arg Ile Arg Gly Asn Leu Ile Ala Tyr Thr Leu Thr
20 25 30

Lys Ala Asn Met Lys Asp Asn Lys Tyr Glu Ser Thr Val Val Val Glu
35 40 45

Asp Leu Glu Thr Gly Ser Arg Arg Phe Ile Glu Asn Ala Ser Met Pro
50 55 60

Arg Ile Ser Pro Asp Gly Arg Lys Leu Ala Phe Thr Cys Phe Asn Glu
65 70 75 80

Glu Lys Lys Glu Thr Glu Ile Trp Val Ala Asp Ile Gln Thr Leu Ser
85 90 95

Ala Lys Lys Val Leu Ser Thr Lys Asn Val Arg Ser Met Gln Trp Asn
100 105 110

Asp Asp Ser Arg Arg Leu Leu Val Val Gly Phe Lys Arg Arg Asp Asp
115 120 125

Glu Asp Phe Val Phe Asp Asp Asp Val Pro Val Trp Phe Asp Asn Met
130 135 140

Gly Phe Phe Asp Gly Glu Lys Thr Thr Phe Trp Val Leu Asp Thr Glu
145 150 155 160

Ala Glu Glu Ile Ile Glu Gln Phe Glu Lys Pro Arg Phe Ser Ser Gly
165 170 175

Leu Trp His Gly Asp Ala Ile Val Val Asn Val Pro His Arg Glu Gly
180 185 190

Ser Lys Pro Ala Leu Phe Lys Phe Tyr Asp Ile Val Leu Trp Lys Asp
195 200 205

Gly Glu Glu Glu Lys Leu Phe Glu Arg Val Ser Phe Glu Ala Val Asp
210 215 220

Ser Asp Gly Lys Arg Ile Leu Leu Arg Gly Lys Lys Lys Lys Arg Phe
225 230 235 240

Ile Ser Glu His Asp Trp Leu Tyr Leu Trp Asp Gly Glu Leu Lys Pro
245 250 255

Ile Tyr Glu Gly Pro Leu Asp Val Trp Glu Ala Lys Leu Thr Glu Gly
260 265 270

Lys Val Tyr Phe Leu Thr Pro Asp Ala Gly Arg Val Asn Leu Trp Leu
275 280 285

Trp Asp Gly Lys Ala Glu Arg Val Val Thr Gly Asp His Trp Ile Tyr
290 295 300

Gly Leu Asp Val Ser Asp Gly Lys Ala Leu Leu Leu He Met Thr Ala 
305 310 315 320

Thr Arg Ile Gly Glu Leu Tyr Leu Tyr Asp Gly Glu Leu Lys Gln Val
325 330 335

Thr Glu Tyr Asn Gly Pro Ile Phe Arg Lys Leu Lys Thr Phe Glu Pro
340 345 350

Arg His Phe Arg Phe Lys Ser Lys Asp Leu Glu Ile Asp Gly Trp Tyr
355 360 365

Leu Arg Pro Glu Val Lys Glu Glu Lys Ala Pro Val Ile Val Phe Val
370 375 380

His Gly Gly Pro Lys Gly Met Tyr Gly His Arg Phe Val Tyr Glu Met
385 390 395 400

Gln Leu Met Ala Ser Lys Gly Tyr Tyr Val Val Phe Val Asn Pro Arg
405 410 415

Gly Ser Asp Gly Tyr Ser Glu Asp Phe Ala Leu Arg Val Leu Glu Arg
420 425 430

Thr Gly Leu Glu Asp Phe Glu Asp Ile Met Asn Gly Ile Glu Glu Phe
435 440 445

Phe Lys Leu Glu Pro Gln Ala Asp Arg Glu Arg Val Gly Ile Thr Gly
450 455 460

Ile Ser Tyr Gly Gly Phe Met Thr Asn Trp Ala Leu Thr Gln Ser Asp
465 470 475 480

Leu Phe Lys Ala Gly Ile Ser Glu Asn Gly Ile Ser Tyr Trp Leu Thr
485 490 495

Ser Tyr Ala Phe Ser Asp Ile Gly Leu Trp Tyr Asp Val Glu Val Ile
500 505 510

Gly Pro Asn Pro Leu Glu Asn Glu Asn Phe Arg Lys Leu Ser Pro Leu
515 520 525

Phe Tyr Ala Gln Asn Val Lys Ala Pro Ile Leu Leu Ile His Ser Leu
530 535 540

Glu Asp Tyr Arg Cys Pro Leu Asp Gln Ser Leu Met Phe Tyr Asn Val
545 550 555 560

Leu Lys Asp Met Gly Lys Glu Ala Tyr Ile Ala Ile Phe Lys Arg Gly
565 570 575

Ala His Gly His Ser Val Arg Gly Ser Pro Arg His Arg Pro Lys Arg
580 585 590

Tyr Arg Leu Phe Ile Glu Phe Phe Glu Arg Lys Leu Lys Lys Tyr Glu
595 600 605

Glu Gly Phe Glu Val Glu Lys Ile Leu Lys Gly Asn Gly Asn
610 615 620

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 50 NUCLEOTIDES

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: Oligonucleotide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CCGAGAATTC ATTAAAGAGG AGAAATTAAC TATGACCGGC ATCGAATGGA 50 



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 33 NUCLEOTIDES

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: Oligonucleotide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

AATAAGGATC CACACTGGCA CAGTGTCAAG ACA 33
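
The two oligonucleotides can be checked mechanically against the stated lengths and against the start of the coding sequence shown in Figure 1. The sketch below is illustrative only: the helper name `describe_primers` is hypothetical, and the identification of GAATTC and GGATCC as the EcoRI and BamHI recognition hexamers is general knowledge rather than something stated in this section, which does not say which enzymes, if any, were used.

```python
# Oligonucleotides as given in SEQ ID NO:3 and SEQ ID NO:4 (spaces removed).
SEQ_ID_NO_3 = "CCGAGAATTCATTAAAGAGGAGAAATTAACTATGACCGGCATCGAATGGA"
SEQ_ID_NO_4 = "AATAAGGATCCACACTGGCACAGTGTCAAGACA"

# Assumption: well-known recognition hexamers, not named in this section.
ECORI_SITE = "GAATTC"
BAMHI_SITE = "GGATCC"

# First seven codons of the coding sequence as shown in Figure 1.
GENE_START = "ATGACCGGCATCGAATGGAAC"


def describe_primers() -> dict:
    """Report lengths, 0-based site positions, and overlap with the gene start."""
    atg = SEQ_ID_NO_3.find("ATG")
    return {
        "len_seq3": len(SEQ_ID_NO_3),
        "len_seq4": len(SEQ_ID_NO_4),
        "ecori_index_seq3": SEQ_ID_NO_3.find(ECORI_SITE),
        "bamhi_index_seq4": SEQ_ID_NO_4.find(BAMHI_SITE),
        "atg_index_seq3": atg,
        # True when everything from the first ATG onward matches the gene start.
        "tail_matches_gene": GENE_START.startswith(SEQ_ID_NO_3[atg:]),
    }


for key, value in describe_primers().items():
    print(f"{key}: {value}")
```

Under these assumptions, the 3' portion of SEQ ID NO:3 from its first ATG onward coincides with the first 19 nucleotides of the coding sequence in Figure 1.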
What Is Claimed Is:

1. An isolated polynucleotide which encodes the amino
acid sequence set forth in SEQ ID NO:2.

2. An isolated polynucleotide selected from the group 
consisting of: 

a) SEQ ID NO:1;

b) SEQ ID NO:1, wherein T can also be U;

c) nucleic acid sequences complementary to a) and b);
and

d) fragments of a), b), or c) that are at least 15
bases in length and that will hybridize to DNA
which encodes the amino acid sequence of SEQ ID
NO:2.

3. The polynucleotide of claim 1, wherein the
polynucleotide is isolated from a prokaryote.

4. An expression vector including the polynucleotide
of claim 1.

5. The vector of claim 4, wherein the vector is a 
plasmid. 

6. The vector of claim 4, wherein the vector is
virus-derived.

7. A host cell transformed with the vector of claim 4.

8. The host cell of claim 1, wherein the cell is
prokaryotic.

9. The polynucleotide of claim 1 which encodes the
enzyme comprising amino acid 1 to 622 of SEQ ID
NO:2.

10. The polynucleotide of claim 1 comprising the
sequence as set forth in SEQ ID NO:1 from
nucleotide 1 to nucleotide 1866.

11. A substantially pure polypeptide selected from the 
group consisting of: 

a) an enzyme comprising an amino acid sequence
which is at least 70% identical to the amino
acid sequence set forth in SEQ ID NO:2;

b) an enzyme which comprises at least 30 amino
acid residues to the enzyme of a); and

c) the amino acid sequence as set forth in SEQ
ID NO:2.

12. Antibodies that bind to the polypeptide of claim
11.

13. The antibodies of claim 12, wherein the antibodies 
are polyclonal. 

14. The antibodies of claim 12, wherein the antibodies
are monoclonal.

15. A method for producing an enzyme comprising
growing a host cell of claim 7 under conditions 
which allow the expression of the nucleic acid and 
isolating the enzyme encoded by the nucleic acid. 

16. A process for producing a recombinant cell 
comprising transforming or transfecting the cell 
with the vector of claim 4 such that the cell 
expresses a polypeptide encoded by the DNA 
contained in the vector. 

17. A process for removal of arginine, phenylalanine or
methionine from the N-terminal end of peptides in
peptide or peptidomimetic synthesis, comprising:
administering an amount of the enzyme of claim 10
effective for removal of arginine, phenylalanine or
methionine from the N-terminal end of peptides in
peptide or peptidomimetic synthesis.
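
Several claim terms describe simple sequence operations: replacing T with U (claim 2 b), taking a complementary sequence (claim 2 c), and comparing sequences by percent identity (claim 11 a). The following is a minimal sketch of those operations, not the method by which the claims are to be construed. The function names are hypothetical, and `percent_identity` assumes two already-aligned sequences of equal length; this section does not give the definition of identity, which may differ.

```python
# Watson-Crick pairing for DNA; only the four unambiguous bases are handled.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def to_rna(dna: str) -> str:
    """Return the same sequence with every T replaced by U."""
    return dna.upper().replace("T", "U")


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of positions that match between two pre-aligned sequences.

    Assumption: both inputs have the same length and no gap handling is needed.
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


# First three codons of the coding sequence shown in Figure 1.
first_codons = "ATGACCGGC"
print(reverse_complement(first_codons))
print(to_rna(first_codons))

# First ten residues of SEQ ID NO:2 against a made-up variant with three changes.
print(percent_identity("MTGIEWNHET", "MTGIEWAAAT"))
```

The variant peptide in the last line is invented purely to exercise the function; it is not taken from the specification.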
Figure 1

Thermococcus GU5L5 Amidase

1 ATG ACC GGC ATC GAA TGG AAC CAC GAG ACC TTT TCT AAG TTC GCC TAC CTG GGC GAC CCG 60
1 Met Thr Gly Ile Glu Trp Asn His Glu Thr Phe Ser Lys Phe Ala Tyr Leu Gly Asp Pro 20

61 AGG ATA CGG GGA AAC TTA ATC GCG TAC ACC CTG ACG AAG GCC AAC ATG AAG GAC AAC AAG 120
21 Arg Ile Arg Gly Asn Leu Ile Ala Tyr Thr Leu Thr Lys Ala Asn Met Lys Asp Asn Lys 40

121 TAC GAG AGC ACG GTT GTT GTT GAA GAC CTT GAA ACG GGC TCA AGG CGC TTC ATC GAG AAC 180
41 Tyr Glu Ser Thr Val Val Val Glu Asp Leu Glu Thr Gly Ser Arg Arg Phe Ile Glu Asn 60

181 GCC TCA ATG CCG AGG ATT TCG CCA GAC GGC AGA AAG CTC GCC TTC ACC TGC TTT AAC GAG 240
61 Ala Ser Met Pro Arg Ile Ser Pro Asp Gly Arg Lys Leu Ala Phe Thr Cys Phe Asn Glu 80

241 GAG AAG AAG GAG ACC GAG ATA TGG GTG GCC GAT ATC CAG ACC CTG AGC GCC AAG AAA GTC 300
81 Glu Lys Lys Glu Thr Glu Ile Trp Val Ala Asp Ile Gln Thr Leu Ser Ala Lys Lys Val 100

301 CTC TCA ACT AAA AAC GTC CGC TCG ATG CAG TGG AAC GAC GAT TCA AGG AGA CTC TTA GTT 360
101 Leu Ser Thr Lys Asn Val Arg Ser Met Gln Trp Asn Asp Asp Ser Arg Arg Leu Leu Val 120
361 GTC GGC TTC AAG AGG AGG GAC GAT GAG GAC TTC GTC TTT GAC GAC GAC GTC CCG GTC TGG 420
121 Val Gly Phe Lys Arg Arg Asp Asp Glu Asp Phe Val Phe Asp Asp Asp Val Pro Val Trp 140

421 TTC GAC AAT ATG GGA TTC TTT GAT GGA GAG AAG ACG ACG TTC TGG GTT CTT GAC ACT GAG 480
141 Phe Asp Asn Met Gly Phe Phe Asp Gly Glu Lys Thr Thr Phe Trp Val Leu Asp Thr Glu 160

481 GCC GAG GAG ATA ATC GAG CAG TTC GAG AAG CCG AGG TTT TCG AGT GGC CTC TGG CAC GGC 540
161 Ala Glu Glu Ile Ile Glu Gln Phe Glu Lys Pro Arg Phe Ser Ser Gly Leu Trp His Gly 180

541 GAT GCG ATA GTT GTG AAC GTC CCG CAC CGC GAG GGG AGC AAG CCT GCC CTG TTC AAG TTC 600
181 Asp Ala Ile Val Val Asn Val Pro His Arg Glu Gly Ser Lys Pro Ala Leu Phe Lys Phe 200

601 TAC GAC ATA GTC CTA TGG AAG GAC GGG GAG GAA GAG AAG CTC TTC GAG AGG GTC TCC TTC 660
201 Tyr Asp Ile Val Leu Trp Lys Asp Gly Glu Glu Glu Lys Leu Phe Glu Arg Val Ser Phe 220

661 GAG GCG GTT GAC TCC GAC GGA AAG AGA ATA CTC CTG AGG GGC AAG AAA AAA AAG CGG TTC 720
221 Glu Ala Val Asp Ser Asp Gly Lys Arg Ile Leu Leu Arg Gly Lys Lys Lys Lys Arg Phe 240

721 ATC AGC GAG CAC GAC TGG CTG TAC CTC TGG GAC GGC GAG CTT AAA CCG ATC TAC GAG GGC 780
241 Ile Ser Glu His Asp Trp Leu Tyr Leu Trp Asp Gly Glu Leu Lys Pro Ile Tyr Glu Gly 260
781 CCG CTC GAC GTC TGG GAA GCC AAG CTC ACG GA? GGA AAG GTC TAC TTC CTC ACT CCA GAT 840
261 Pro Leu Asp Val Trp Glu Ala Lys Leu Thr Glu Gly Lys Val Tyr Phe Leu Thr Pro Asp 280

841 GCG GGC AGG GTA AAC CTC TGG CTC TGG GAC GGG AAG GCC GAG CGT GTT GTT ACC GGC GAC 900
281 Ala Gly Arg Val Asn Leu Trp Leu Trp Asp Gly Lys Ala Glu Arg Val Val Thr Gly Asp 300

901 CAC TGG ATT TAC GGG CTT GAC GTC AGC GAT GGC AAA GCA TTG CTC CTC ATC ATG ACC GCC 960
301 His Trp Ile Tyr Gly Leu Asp Val Ser Asp Gly Lys Ala Leu Leu Leu Ile Met Thr Ala 320

961 ACG AGG ATA GGC GAG CTC TAC CTC TAC GAC GGC GAG CTG AAA CAG GTC ACC GAA TAC AAC 1020
321 Thr Arg Ile Gly Glu Leu Tyr Leu Tyr Asp Gly Glu Leu Lys Gln Val Thr Glu Tyr Asn 340

1021 GGG CCG ATA TTC AGG AAG CTC AAG ACC TTC GAG CCG AGG CAC TTC CGC TTC AAG AGC AAA 1080
341 Gly Pro Ile Phe Arg Lys Leu Lys Thr Phe Glu Pro Arg His Phe Arg Phe Lys Ser Lys 360

1081 GAC CTC GAG ATA GAC GGC TGG TAC CTC AGG CCG GAG GTT AAA GAG GAG AAG GCC CCG GTG 1140
361 Asp Leu Glu Ile Asp Gly Trp Tyr Leu Arg Pro Glu Val Lys Glu Glu Lys Ala Pro Val 380

1141 ATA GTC TTC GTC CAC GGC GGG CCG AAG GGC ATG TAC GGA CAC CGC TTC GTC TAC GAG ATG 1200
381 Ile Val Phe Val His Gly Gly Pro Lys Gly Met Tyr Gly His Arg Phe Val Tyr Glu Met 400

1201 CAG CTG ATG GCG AGC AAG GGC TAC TAC GTC GTC TTC GTG AAC CCG CGC GGC AGC GAC GGC 1260
401 Gln Leu Met Ala Ser Lys Gly Tyr Tyr Val Val Phe Val Asn Pro Arg Gly Ser Asp Gly 420


WO 97/48794 PCT/US97/09319

1261 TAT AGC GAA GAC TTC GCG CTC CGC GTC CTG GAG AGG ACT GGC TTG GAG GAC TTT GAG GAC 1320
421 Tyr Ser Glu Asp Phe Ala Leu Arg Val Leu Glu Arg Thr Gly Leu Glu Asp Phe Glu Asp 440

1321 ATA ATG AAC GGC ATC GAG GAG TTC TTC AAG CTC GAA CCG CAG GCC GAC AGG GAG CGC GTT 1380
441 Ile Met Asn Gly Ile Glu Glu Phe Phe Lys Leu Glu Pro Gln Ala Asp Arg Glu Arg Val 460

1381 GGA ATA ACG GGC ATA AGC TAC GGC GGC TTC ATG ACC AAC TGG GCC TTG ACT CAG AGC GAC 1440
461 Gly Ile Thr Gly Ile Ser Tyr Gly Gly Phe Met Thr Asn Trp Ala Leu Thr Gln Ser Asp 480

1441 CTC TTC AAG GCA GGA ATA AGC GAG AAC GGC ATA AGC TAC TGG CTC ACC AGC TAC GCC TTC 1500
481 Leu Phe Lys Ala Gly Ile Ser Glu Asn Gly Ile Ser Tyr Trp Leu Thr Ser Tyr Ala Phe 500

1501 TCG GAC ATA GGG CTC TGG TAC GAC GTC GAG GTC ATC GGG CCA AAT CCG TTA GAG AAC GAG 1560
501 Ser Asp Ile Gly Leu Trp Tyr Asp Val Glu Val Ile Gly Pro Asn Pro Leu Glu Asn Glu 520

1561 AAC TTC AGG AAG CTC AGC CCG CTG TTC TAC GCT CAG AAC GTG AAG GCG CCG ATA CTC CTA 1620
521 Asn Phe Arg Lys Leu Ser Pro Leu Phe Tyr Ala Gln Asn Val Lys Ala Pro Ile Leu Leu 540

1621 ATC CAC TCG CTT GAG GAC TAC CGC TGT CCG CTC GAC CAG AGC CTT ATG TTC TAC AAC GTG 1680
541 Ile His Ser Leu Glu Asp Tyr Arg Cys Pro Leu Asp Gln Ser Leu Met Phe Tyr Asn Val 560

1681 CTC AAG GAC ATG GGC AAG GAA GCC TAC ATA GCG ATA TTC AAG CGC GGC GCC CAC GGC CAC 1740
561 Leu Lys Asp Met Gly Lys Glu Ala Tyr Ile Ala Ile Phe Lys Arg Gly Ala His Gly His 580

1741 AGC GTC CGC GGA AGC CCG AGG CAC AGG CCG AAG CGC TAC AGG CTC TTC ATA GAG TTC TTC 1800
581 Ser Val Arg Gly Ser Pro Arg His Arg Pro Lys Arg Tyr Arg Leu Phe Ile Glu Phe Phe 600

1801 GAG CGC AAG CTC AAG AAG TAC GAG GAG GGC TTT GAG GTA GAG AAG ATA CTC AAG GGG AAT 1860
601 Glu Arg Lys Leu Lys Lys Tyr Glu Glu Gly Phe Glu Val Glu Lys Ile Leu Lys Gly Asn 620

1861 GGG AAC TGA 1869
621 Gly Asn End 623
Figure 2

Activity of GU5L5 Amidase with CBZ-Phe-AMC vs DMSO (horizontal axis: % DMSO)
Figure 3